56 research outputs found
Improving SIEM for critical SCADA water infrastructures using machine learning
Network Control Systems (NAC) have been used in many industrial processes. They aim to reduce the human factor burden and efficiently handle the complex process and communication of those systems. Supervisory control and data acquisition (SCADA) systems are used in industrial, infrastructure and facility processes (e.g. manufacturing, fabrication, oil and water pipelines, building ventilation, etc.). Like other Internet of Things (IoT) implementations, SCADA systems are vulnerable to cyber-attacks; robust anomaly detection is therefore a major requirement. However, building an accurate anomaly detection system is not an easy task, because it is difficult to differentiate between cyber-attacks and internal system failures (e.g. hardware failures). In this paper, we present a model that detects anomaly events in a water system controlled by SCADA. Six Machine Learning techniques have been used in building and evaluating the model. The model classifies different anomaly events including hardware failures (e.g. sensor failures), sabotage and cyber-attacks (e.g. DoS and Spoofing). Unlike other detection systems, our proposed work helps accelerate the mitigation process by notifying the operator with additional information when an anomaly occurs. This additional information includes the probability and confidence level of the event(s) occurring. The model is trained and tested using a real-world dataset.
Machine learning based IoT Intrusion Detection System: an MQTT case study (MQTT-IoT-IDS2020 Dataset)
The Internet of Things (IoT) is one of the main research fields in the Cybersecurity domain. This is due to (a) the increased dependency on automated devices, and (b) the inadequacy of general-purpose Intrusion Detection Systems (IDS) for deployment in special-purpose networks. Numerous lightweight protocols are being proposed for IoT device communication. One of the distinguishable IoT machine-to-machine communication protocols is the Message Queuing Telemetry Transport (MQTT) protocol. However, to the best of the authors' knowledge, there are no available IDS datasets that include MQTT benign or attack instances and thus no IDS experimental results are available. In this paper, the effectiveness of six Machine Learning (ML) techniques to detect MQTT-based attacks is evaluated. Three abstraction levels of features are assessed, namely, packet-based, unidirectional flow, and bidirectional flow features. An MQTT simulated dataset is generated and used for the training and evaluation processes. The dataset is released with an open access licence to help the research community further analyse the accompanying challenges. The experimental results demonstrated the adequacy of the proposed ML models to suit MQTT-based network IDS requirements. Moreover, the results emphasise the importance of using flow-based features to discriminate MQTT-based attacks from benign traffic, while packet-based features are sufficient for traditional networking attacks.
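The difference between unidirectional and bidirectional flow features mentioned in the abstract above can be illustrated with a minimal sketch. The packet fields (`ts`, `src`, `sport`, `dst`, `dport`, `proto`, `length`) and the three aggregate features are hypothetical choices for illustration; the abstract does not list the dataset's actual feature set.

```python
from collections import defaultdict

def flow_key(pkt, bidirectional):
    """Grouping key for one packet (hypothetical packet dict layout)."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    if bidirectional and b < a:
        # Canonical endpoint order, so both directions share one key.
        a, b = b, a
    return (a, b, pkt["proto"])

def flow_features(packets, bidirectional):
    """Aggregate packets into flows and compute simple per-flow features."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt, bidirectional)].append(pkt)
    return {
        key: {
            "n_packets": len(pkts),
            "total_bytes": sum(p["length"] for p in pkts),
            "duration": max(p["ts"] for p in pkts) - min(p["ts"] for p in pkts),
        }
        for key, pkts in flows.items()
    }

# Illustrative traffic: a client talking to a broker on port 1883.
packets = [
    {"ts": 0.0, "src": "10.0.0.2", "sport": 50000, "dst": "10.0.0.1",
     "dport": 1883, "proto": "tcp", "length": 60},
    {"ts": 0.1, "src": "10.0.0.1", "sport": 1883, "dst": "10.0.0.2",
     "dport": 50000, "proto": "tcp", "length": 54},
    {"ts": 0.3, "src": "10.0.0.2", "sport": 50000, "dst": "10.0.0.1",
     "dport": 1883, "proto": "tcp", "length": 80},
]

uni = flow_features(packets, bidirectional=False)  # one flow per direction
bi = flow_features(packets, bidirectional=True)    # both directions merged
```

Packet-based features would instead be read from each packet on its own, without any grouping step.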
MOCDroid: multi-objective evolutionary classifier for Android malware detection
Malware threats are growing, while at the same time, concealment strategies are being used to make them undetectable for current commercial Anti-Virus. Android is one of the target architectures where these problems are especially alarming, given how widespread the platform is across everyday devices. Detection is especially relevant for Android markets, which must ensure that all the software they offer is clean; however, obfuscation has proven to be effective at evading the detection process. In this paper we leverage third-party calls to bypass the effects of these concealment strategies, since they cannot be obfuscated. We combine clustering and multi-objective optimisation to generate a classifier based on specific behaviours defined by groups of third-party calls. The optimiser ensures that these groups are related to malicious or benign behaviours, removing any non-discriminative patterns. This tool, named MOCDroid, achieves an accuracy of 94.6% in testing with 2.12% false positives on real apps extracted from the wild, outperforming all commercial Anti-Virus engines from VirusTotal.
SiteSeek: Post-translational modification analysis using adaptive locality-effective kernel methods and new profiles
Background: Post-translational modifications have a substantial influence on the structure and functions of proteins. Post-translational phosphorylation is one of the most common modifications that occur in intracellular proteins. Accurate prediction of protein phosphorylation sites is of great importance for the understanding of diverse cellular signalling processes in both the human body and in animals. In this study, we propose a new machine learning based protein phosphorylation site predictor, SiteSeek. SiteSeek is trained using a novel compact evolutionary and hydrophobicity profile to detect possible protein phosphorylation sites for a target sequence. The newly proposed method proves to be more accurate and exhibits a much more stable predictive performance than currently existing phosphorylation site predictors. Results: The performance of the proposed model was compared to nine different existing machine learning models and four widely known phosphorylation site predictors on the newly proposed PS-Benchmark_1 dataset to contrast their accuracy, sensitivity, specificity and correlation coefficient. SiteSeek showed better predictive performance with 86.6% accuracy, 83.8% sensitivity, 92.5% specificity and a correlation coefficient of 0.77 on the four main kinase families (CDK, CK2, PKA, and PKC). Conclusion: The newly proposed methods used in SiteSeek were shown to be useful for the identification of protein phosphorylation sites, as SiteSeek performed much better than widely known predictors on the newly built PS-Benchmark_1 dataset.
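The four evaluation measures named in the abstract above can all be derived from a binary confusion matrix. The following is a minimal sketch; it assumes that "correlation coefficient" refers to the Matthews correlation coefficient, which the abstract does not state, and the counts in the example are invented for illustration rather than taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: Matthews correlation coefficient; defined as 0 when denom is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Invented counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```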
Spatio-Temporal Features of Visual Exploration in Unilaterally Brain-Damaged Subjects with or without Neglect: Results from a Touchscreen Test
Cognitive assessment in a clinical setting is generally made by pencil-and-paper tests, while computer-based tests enable the measurement and the extraction of additional performance indexes. Previous studies have demonstrated that in a research context exploration deficits occur also in patients without evidence of unilateral neglect at pencil-and-paper tests. The objective of this study is to apply a touchscreen-based cancellation test, feasible also in a clinical context, to large groups of control subjects and unilaterally brain-damaged patients, with and without unilateral spatial neglect (USN), in order to assess disturbances of the exploratory skills. A computerized cancellation test on a touchscreen interface was used for assessing the performance of 119 neurologically unimpaired control subjects and 193 patients with unilateral right or left hemispheric brain damage, either with or without USN. A set of performance indexes were defined including Latency, Proximity, Crossings and their spatial lateral gradients, and Preferred Search Direction. Classic outcome scores were computed as well. Results show statistically significant differences among groups (assumed p<0.05). Right-brain-damaged patients with USN were significantly slower (median latency per detected item was 1.18 s) and less efficient (about 13 search-path crossings) in the search than controls (median latency 0.64 s; about 3 crossings). Their preferred search direction (53.6% downward, 36.7% leftward) was different from the one in control patients (88.2% downward, 2.1% leftward). Right-brain-damaged patients without USN showed a significantly abnormal behavior (median latency 0.84 s, about 5 crossings, 83.3% downward and 9.1% leftward direction) situated half way between controls and right-brain-damaged patients with USN. 
Left-brain-damaged patients without USN were significantly slower and less efficient than controls (latency 1.19 s, about 7 crossings), while preserving a normal preferred search direction (93.7% downward). The proposed touchscreen-based assessment thus revealed disorders in spatial exploration even in patients without clinically diagnosed USN.
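One plausible way to compute a "Crossings" index from a recorded sequence of touch positions is to count how often the search path intersects itself. The sketch below makes that assumption; the abstract does not define the index precisely, and the function names and the example path are hypothetical.

```python
def _orient(p, q, r):
    """Signed area test: > 0 if p, q, r turn counter-clockwise, < 0 if clockwise."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def _properly_cross(a, b, c, d):
    """True if segment a-b and segment c-d intersect at an interior point of both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)

def count_crossings(path):
    """Count self-intersections of a path given as an ordered list of (x, y) points."""
    segments = list(zip(path, path[1:]))
    total = 0
    for i in range(len(segments)):
        # Skip the adjacent segment, which only shares an endpoint.
        for j in range(i + 2, len(segments)):
            if _properly_cross(*segments[i], *segments[j]):
                total += 1
    return total

# Hypothetical touch sequence: the third stroke cuts across the first.
crossing_path = [(0, 0), (2, 2), (2, 0), (0, 2)]
straight_path = [(0, 0), (1, 0), (2, 0)]
n_crossing = count_crossings(crossing_path)
n_straight = count_crossings(straight_path)
```

The pairwise check is quadratic in the number of touches, which is unproblematic for the few dozen targets of a typical cancellation test.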
Attachment, infidelity, and loneliness in college students involved in a romantic relationship: the role of relationship satisfaction, morbidity and prayer for partner
This study examined the mediating effects of relationship satisfaction, prayer
for a partner, and morbidity in the relationship between attachment and loneliness, infidelity
and loneliness, and psychological morbidity and loneliness, in college students
involved in a romantic relationship. Participants were students in an introductory course on
family development. This study examined only students (n = 345) who were involved in a
romantic relationship. The average age of participants was 19.46 years (SD = 1.92) and 25%
were male. Measures were the Short-form UCLA Loneliness Scale (ULS-8) (Hays and DiMatteo in J Pers
Assess 51:69–81, doi:10.1207/s15327752jpa5101_6, 1987); Relationship Satisfaction
Scale (Funk and Rogge in J Fam Psychol 21:572–583, doi:10.1037/0893-3200.21.4.572,
2007); Rotterdam Symptom Checklist (De Haes et al. in Measuring the quality of life of
cancer patients with the Rotterdam Symptom Checklist (RSCL): a manual, Northern
Centre for Healthcare Research, Groningen, 1996); Prayer for Partner Scale, (Fincham
et al. in J Pers Soc Psychol 99:649–659, doi:10.1037/a0019628, 2010); Infidelity Scale,
(Drigotas et al. in J Pers Soc Psychol 77:509–524, doi:10.1037/0022-3514.77.3.509, 1999);
and the Experiences in Close Relationship Scale-short form (Wei et al. in J Couns Psychol
52(4):602–614, doi:10.1037/0022-0167.52.4.602, 2005). Results showed that relationship
satisfaction mediated the relationship between avoidance attachment and loneliness and
between infidelity and loneliness. Physical morbidity mediated the relationship between
anxious attachment and psychological morbidity. Psychological morbidity mediated the
relationship between anxious attachment and physical morbidity. The present results
expand the literature on attachment by presenting evidence that anxious and avoidant partners experience loneliness differently. Implications for couple’s therapy are addressed.
Future research should replicate these results with older samples and married couples.
Acknowledgments: This research was supported by Grant Number 90FE0022 from the United States
Department of Health and Human Services awarded to the last author.
Unsupervised record matching with noisy and incomplete data
We consider the problem of duplicate detection in noisy and incomplete data: given a large data set in which each record has multiple entries (attributes), detect which distinct records refer to the same real world entity. This task is complicated by noise (such as misspellings) and missing data, which can lead to records being different, despite referring to the same entity. Our method consists of three main steps: creating a similarity score between records, grouping records together into "unique entities", and refining the groups. We compare various methods for creating similarity scores between noisy records, considering different combinations of string matching, term frequency-inverse document frequency methods, and n-gram techniques. In particular, we introduce a vectorized soft term frequency-inverse document frequency method, with an optional refinement step. We also discuss two methods to deal with missing data in computing similarity scores.
We test our method on the Los Angeles Police Department Field Interview Card data set, the Cora Citation Matching data set, and two sets of restaurant review data. The results show that the methods that use words as the basic units are preferable to those that use 3-grams. Moreover, in some (but certainly not all) parameter ranges soft term frequency-inverse document frequency methods can outperform the standard term frequency-inverse document frequency method. The results also confirm that our method for automatically determining the number of groups typically works well and allows for accurate results in the absence of a priori knowledge of the number of unique entities in the data set.
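The first step described above, scoring the similarity of noisy records, can be sketched with the standard TF-IDF method using either words or character 3-grams as basic units. This is not the authors' vectorized soft TF-IDF; it is a plain baseline. The smoothed IDF form `log(1 + N/df)` and the sample records are assumptions made for illustration.

```python
import math
from collections import Counter

def word_tokens(text):
    """Split a record into lowercase words."""
    return text.lower().split()

def char_ngrams(text, n=3):
    """Split a record into overlapping lowercase character n-grams."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def tfidf_vectors(records, tokenize):
    """Return one L2-normalised sparse TF-IDF vector (dict) per record."""
    docs = [Counter(tokenize(r)) for r in records]
    n_docs = len(docs)
    doc_freq = Counter(tok for d in docs for tok in d)
    vectors = []
    for d in docs:
        # Assumed smoothed IDF: log(1 + N/df), always positive.
        vec = {t: c * math.log(1 + n_docs / doc_freq[t]) for t, c in d.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors

def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Invented records: the first two refer to the same restaurant.
records = [
    "Joe's Pizza 12 Main St",
    "Joes Pizza 12 Main Street",
    "Golden Dragon 88 Elm Ave",
]
word_vecs = tfidf_vectors(records, word_tokens)
gram_vecs = tfidf_vectors(records, char_ngrams)
word_sim_same = cosine(word_vecs[0], word_vecs[1])
word_sim_diff = cosine(word_vecs[0], word_vecs[2])
gram_sim_same = cosine(gram_vecs[0], gram_vecs[1])
gram_sim_diff = cosine(gram_vecs[0], gram_vecs[2])
```

Grouping records into unique entities would then operate on the resulting pairwise similarity scores, for example by thresholding them.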
Visualizing Big Data with augmented and virtual reality: challenges and research agenda
This paper provides a multi-disciplinary overview of the research issues and achievements in the field of Big Data and its visualization techniques and tools. The main aim is to summarize challenges in visualization methods for existing Big Data, as well as to offer novel solutions for issues related to the current state of Big Data Visualization. This paper provides a classification of existing data types, analytical methods, visualization techniques and tools, with a particular emphasis placed on surveying the evolution of visualization methodology over the past years. Based on the results, we reveal disadvantages of existing visualization methods. Despite the technological development of the modern world, human involvement (interaction), judgment and logical thinking are necessary while working with Big Data. Therefore, the role of human perceptual limitations in handling large amounts of information is evaluated. Based on the results, a non-traditional approach is proposed: we discuss how the capabilities of Augmented Reality and Virtual Reality could be applied to the field of Big Data Visualization. We discuss the promising utility of Mixed Reality technology integration with applications in Big Data Visualization. Placing the most essential data in the central area of the human visual field in Mixed Reality would allow one to obtain the presented information in a short period of time without significant data losses due to human perceptual issues. Furthermore, we discuss the impacts of new technologies, such as Virtual Reality displays and Augmented Reality helmets, on Big Data visualization, and we classify the main challenges of integrating the technology.
- …